Parallel algorithms for Burrows-Wheeler compression and decompression
نویسندگان
چکیده
We present work-optimal PRAM algorithms for Burrows-Wheeler compression and decompression of strings over a constant alphabet. For a string of length n, the depth of the compression algorithm is O(log n), and the depth of the the corresponding decompression algorithm is O(log n). These appear to be the first polylogarithmic-time work-optimal parallel algorithms for any standard lossless compression scheme. The algorithms for the individual stages of compression and decompression may also be of independent interest: 1. a novel O(log n)-time, O(n)-work PRAM algorithm for Huffman decoding; 2. original insights into the stages of the BW compression and decompression problems, bringing out parallelism that was not readily apparent, allowing them to be mapped to elementary parallel routines that have O(log n)-time, O(n)-work solutions, such as: (i) prefix-sums problems with an appropriately-defined associative binary operator for several stages, and (ii) list ranking for the final stage of decompression.
منابع مشابه
Empirical Speedup Study of Truly Parallel Data
We present an empirical study of novel work-optimal parallel algorithms for Burrows-Wheeler compression and decompression of strings over a constant alphabet. To validate these theoretical algorithms, we implement them on the experimental XMT computing platform developed especially for supporting parallel algorithms at the University of Maryland. We show speedups of up to 25x for compression, a...
متن کاملImprovements to the Burrows-Wheeler Compression Algorithm: After BWT Stages
The lossless Burrows-Wheeler Compression Algorithm has received considerable attention over recent years for both its simplicity and effectiveness. It is based on a permutation of the input sequence − the Burrows-Wheeler Transform − which groups symbols with a similar context close together. In the original version, this permutation was followed by a Move-To-Front transformation and a final ent...
متن کاملMedium-Space Algorithms for Inverse BWT
The Burrows–Wheeler transform is a powerful tool for data compression and has been the focus of intense research in the last decade. Little attention, however, has been paid to the inverse transform, even though it is a bottleneck in decompression. We introduce three new inversion algorithms with improved performance in a wide range of the space-time spectrum, as confirmed by both theoretical a...
متن کاملDynamic Decompression for Text Files
Compression algorithms reduce the redundancy in data representation to decrease the storage required for that data. Lossless compression researchers have developed highly sophisticated approaches, such as Huffman encoding, arithmetic encoding, the Lempel-Ziv (LZ) family, Dynamic Markov Compression (DMC), Prediction by Partial Matching (PPM), and Burrows-Wheeler Transform (BWT) based algorithms....
متن کاملFountain codes for lossless data compression
This paper proposes a universal variable-length lossless compression algorithm based on fountain codes. The compressor concatenates the Burrows-Wheeler block sorting transform (BWT) with a fountain encoder, together with the closed-loop iterative doping algorithm. The decompressor uses a Belief Propagation algorithm in conjunction with the iterative doping algorithm and the inverse BWT. Linear-...
متن کاملذخیره در منابع من
با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید
عنوان ژورنال:
- Theor. Comput. Sci.
دوره 525 شماره
صفحات -
تاریخ انتشار 2014